Stochastic Differential Equations

This brief notebook is based on the article by D. J. Higham, An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Review 43:525-546 (2001). Higham provides Matlab codes illustrating the basic ideas at http://personal.strath.ac.uk/d.j.higham/algfiles.html; this notebook is a Python reimplementation.

Wiener processes

We start by looking at the behaviour of a Wiener process (standard Brownian motion): a stochastic process whose increments are independent, normally distributed random variables.
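Concretely, a Wiener process $W(t)$ satisfies $W(0) = 0$, its increments over disjoint intervals are independent, and \begin{equation} W(t) - W(s) \sim N(0, t - s) \quad \text{for } 0 \le s < t. \end{equation} This is what the discretization below uses: over a timestep $dt$ each increment is an independent draw $dW_j \sim \sqrt{dt}\, N(0, 1)$.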


In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

In [2]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(dt, T, N) # Grid points t_j = j*dt; the unstored value W(0) = 0 is not plotted

W = np.zeros(N)
dW = np.zeros(N)

dW[0] = np.sqrt(dt) * np.random.randn() # The first increment also has variance dt
W[0] = dW[0]

# This defines the Wiener process, or Brownian motion
for j in range(1, N):
    dW[j] = np.sqrt(dt) * np.random.randn()
    W[j] = W[j-1] + dW[j]

plt.plot(t, W)
plt.xlabel(r'$t$')
plt.ylabel(r'$W(t)$')
plt.show()


Of course, constructing the process with an explicit Python loop is notably inefficient. Since the increments are independent, the whole construction can be vectorized.


In [3]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(dt, T, N) # Grid points t_j = j*dt

dW = np.sqrt(dt) * np.random.randn(N) # All increments are independent, so initialize the lot in one go
W = np.cumsum(dW) # The Brownian motion is the cumulative sum of all increments

plt.plot(t, W)
plt.xlabel(r'$t$')
plt.ylabel(r'$W(t)$')

plt.show()


A single path by itself is not very useful. Instead we should look at an ensemble of paths, and at functions evaluated along those paths.

The function we choose is $U(t) = \exp \left[ t + \tfrac{1}{2} W(t) \right]$, where $W$ is the Brownian path.
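The reference curve used in the error check below follows from the moment generating function of a normal random variable: $W(t) \sim N(0, t)$, so $E[e^{W(t)/2}] = e^{t/8}$ and hence \begin{equation} E[U(t)] = e^{t}\, E\!\left[ e^{W(t)/2} \right] = e^{9t/8}. \end{equation}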


In [4]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(dt, T, N) # Grid points t_j = j*dt

M = 1000 # Will construct M paths simultaneously

dW = np.sqrt(dt) * np.random.randn(M, N) # Now construct the increments for all points and paths
W = np.cumsum(dW, 1) # Take the cumulative sum for each path separately

U = np.exp(t + 0.5*W) # Construct a function of the Brownian path
Umean = np.mean(U, 0) # Take the mean across all paths

# Compare against the expected average solution
print("Average error is {}".format(np.linalg.norm((Umean - np.exp(9*t/8)), np.inf)))

plt.plot(t, Umean, 'b-')
for i in range(5):
    plt.plot(t, U[i, :], 'r--')
plt.xlabel(r'$t$')
plt.ylabel(r'$U(t)$')
plt.legend(('Mean of all paths', 'Some individual paths'), loc = 'upper left')

plt.show()


Average error is 0.06845574068801419

Stochastic integrals

So we can define and compute the random Brownian process $W$ and construct functions of it, averaging over the various realisations, or paths, to estimate expected values.

Next we want to compute the integral of a function of a random process, $\int_0^T h(t)\, dW(t)$. There are two standard definitions, and they give different answers.

We compare the Ito and Stratonovich definitions below, applied to computing $\int_0^T W(t)\, dW(t)$.
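For this integrand both limits are known in closed form, and these are the exact values used in the error check below: \begin{equation} \int_0^T W\, dW = \tfrac{1}{2} \left( W(T)^2 - T \right) \ \text{(Ito)}, \qquad \int_0^T W \circ dW = \tfrac{1}{2} W(T)^2 \ \text{(Stratonovich)}, \end{equation} so the two answers differ by exactly $T/2$.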


In [5]:
T = 1.0
N = 500
dt = T / N

dW = np.sqrt(dt) * np.random.randn(N)
W = np.cumsum(dW)

shiftedW = np.zeros(N)
shiftedW[1:] = W[:-1]

# The two different integrals. 
# The Ito integral is roughly the Riemann integral evaluated at the left edge of the subinterval
ito = np.sum(shiftedW*dW)
# The Stratonovich integral is roughly the Riemann integral evaluated at the midpoint of the subinterval.
# The midpoint value of W is not stored, so it is sampled via a Brownian bridge:
# the average of the two endpoint values plus an independent N(0, dt/4) correction.
stratonovich = np.sum((0.5*(shiftedW+W) + 0.5*np.sqrt(dt)*np.random.randn(N))*dW)

# Note that the exact solutions are different - markedly so!
print("The Ito integral is {} with error {}".format(ito, np.abs(ito - 0.5*(W[-1]**2-T))))
print("The Stratonovich integral is {} with error {}".format(stratonovich, np.abs(stratonovich - 0.5*W[-1]**2)))


The Ito integral is 1.9889762225064613 with error 0.018325808690833778
The Stratonovich integral is 2.469889754991789 with error 0.0007606588238386891

Euler-Maruyama Method

We now look at solving stochastic differential equations. We will write these in the form

\begin{equation} dX(t) = f(X(t)) dt + g(X(t)) dW(t). \end{equation}

The initial conditions are $X(0) = X_0$, and we solve for $0 \le t \le T$.

The simplest extension of the standard Euler method for solving initial value problems for an ODE is the Euler-Maruyama method. Here we apply it to the simple linear SDE \begin{equation} dX(t) = \lambda X(t)\, dt + \mu X(t)\, dW(t). \end{equation} For definiteness we choose $\lambda = 2, \mu = 1, X_0 = 1, T = 1$.
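For a general SDE of the form above, the Euler-Maruyama method takes steps of size $\Delta t$ via \begin{equation} X_{n+1} = X_n + f(X_n)\, \Delta t + g(X_n)\, \Delta W_n, \qquad \Delta W_n = W(t_{n+1}) - W(t_n), \end{equation} which the loop below implements with $f(X) = \lambda X$ and $g(X) = \mu X$. For this linear SDE the exact solution along a given Brownian path is known, \begin{equation} X(t) = X_0 \exp \left[ \left( \lambda - \tfrac{1}{2} \mu^2 \right) t + \mu W(t) \right], \end{equation} giving us a reference against which to measure the error.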


In [7]:
lam = 2.0
mu  = 1.0
X0  = 1.0
T   = 1.0

N = 2**8
dt = T / N
t = np.linspace(dt, T, N) # Grid points t_j = j*dt, matching the cumulative sums W[j]

dW = np.sqrt(dt) * np.random.randn(N)
W = np.cumsum(dW)

# The exact solution along this realization of the Brownian path
Xtrue = X0 * np.exp((lam - 0.5*mu**2)*t + mu*W)

# Use a coarser timestep for the numerical method.
# Each coarse increment is the sum of R fine increments.
R = 4
Dt = R * dt
L = int(N / R)
tt = np.linspace(0, T, L + 1) # Coarse grid, including the initial time t = 0

# Initial data
Xem = np.zeros(L + 1)
Xem[0] = X0

# Euler-Maruyama applied to this SDE
Xtemp = X0
for j in range(L):
    Winc = np.sum(dW[R*j:R*(j+1)])
    Xtemp += (Dt * lam + mu * Winc) * Xtemp
    Xem[j+1] = Xtemp

print("Error at the endpoint is {}".format(np.abs(Xem[-1] - Xtrue[-1])))

plt.plot(t, Xtrue, 'm-')
plt.plot(tt, Xem, 'r--*')
plt.xlabel(r'$t$')
plt.ylabel(r'$X$')
plt.legend(('Exact solution', 'Euler-Maruyama solution'), loc = 'upper left')

plt.show()


Error at the endpoint is 0.10349624437248517

Convergence

As ever with a numerical method we need to check its convergence behaviour. What does this mean for a stochastic differential equation?

There are two standard measures of convergence for a random variable: strong and weak. For strong convergence we require the numerical solution to converge to the true solution, on average, pointwise: \begin{equation} E| X(t_n) - X_n| \to 0 \quad \text{as } \Delta t \to 0. \end{equation} The strong order of convergence, $\gamma$, is the largest exponent such that, for some constant $C$, \begin{equation} E| X(t_n) - X_n | \le C \Delta t^{\gamma} \end{equation} whenever the timestep $\Delta t$ is sufficiently small. Measuring this at a single point, such as the endpoint $t = T$, $n = N$, gives us a single number to check.

Weak convergence only requires the averages to converge: \begin{equation} | E(X(t_n)) - E(X_n) | \to 0 \quad \text{as } \Delta t \to 0. \end{equation} The weak order of convergence is defined in a similar fashion through \begin{equation} | E(X(t_n)) - E(X_n) | \le C \Delta t^{\gamma}. \end{equation}

We test this on the linear SDE above. For Euler-Maruyama the expected orders are $\gamma = \tfrac{1}{2}$ (strong) and $\gamma = 1$ (weak).


In [9]:
lam = 2.0
mu  = 1.0
X0  = 1.0
T   = 1.0

N = 2**9 # Number of steps at finest resolution
dt = T / N

M = 1000 # Number of paths to sample

X_strong_error = np.zeros((M, 5))
Dtvals = np.zeros(5)

for s in range(M):
    dW = np.sqrt(dt) * np.random.randn(N)
    W = np.cumsum(dW)
    Xtrue = X0 * np.exp((lam - 0.5*mu**2)*T + mu*W[-1]) # Exact solution at t = T for this path
    for p in range(5):
        R = 2**p
        Dt = R * dt
        Dtvals[p] = Dt
        L = int(N / R)
        Xtemp = X0
        for j in range(L):
            Winc = np.sum(dW[R*j:R*(j+1)])
            Xtemp += (Dt*lam + mu*Winc)*Xtemp
        X_strong_error[s,p] = np.abs(Xtemp - Xtrue)
        
strong_error = np.mean(X_strong_error, 0)

# Measure the convergence rate
p = np.polyfit(np.log(Dtvals), np.log(strong_error), 1)
print("Measured convergence rate is {}".format(p[0]))

plt.loglog(Dtvals, strong_error, 'b*-')
plt.loglog(Dtvals, np.exp(p[1])*np.sqrt(Dtvals), 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'Sample average of $|X(t) - X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = \frac{1}{2}$'), loc = 'upper left')

plt.show()


Measured convergence rate is 0.525663403149245

For the weak convergence test we have to use far more paths to get something sensible: the Monte Carlo sampling error in the mean, which decays like $M^{-1/2}$, must be small compared with the discretization error we are trying to measure.


In [10]:
lam = 2.0
mu  = 1.0
X0  = 1.0
T   = 1.0

N = 2**9 # Number of steps at finest resolution
dt = T / N

M = 100000 # Number of paths to sample

X_weak = np.zeros(5)
Dtvals = np.zeros(5)

for p in range(5):
    R = 2**p
    Dt = R * dt
    Dtvals[p] = Dt
    L = int(N / R)
    Xtemp = X0 * np.ones(M)
    for j in range(L):
        # Only the distribution matters for weak convergence, so we can draw
        # fresh increments at each resolution rather than refining a single path
        Winc = np.sqrt(Dt)*np.random.randn(M)
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
    X_weak[p] = np.mean(Xtemp)
weak_error = np.abs(X_weak - np.exp(lam))
        
# Measure the convergence rate
p = np.polyfit(np.log(Dtvals), np.log(weak_error), 1)
print("Measured convergence rate is {}".format(p[0]))

plt.loglog(Dtvals, weak_error, 'b*-')
plt.loglog(Dtvals, np.exp(p[1])*Dtvals, 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'$|E(X(t)) -$ Sample average of $X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = 1$'), loc = 'upper left')

plt.show()


Measured convergence rate is 1.093493547736591

Higher order - Milstein's method

As the results above show, the convergence rates of the Euler-Maruyama method are not wonderful. Higher order methods for SDEs can be constructed from the Ito-Taylor expansion (the stochastic analogue of the Taylor series, built by repeated application of Ito's lemma), in much the same way that standard Runge-Kutta methods are constructed for ODEs.

Here we follow Higham by applying Milstein's method to the SDE \begin{equation} dX = r X (K - X)\, dt + \beta X\, dW(t) \end{equation} with initial data $X(0) = X_0 = 0.5$ and parameter values $r = 2, K = 1, \beta = 1/4$. Checking the strong convergence rate shows the improvement this method brings.
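For the general SDE $dX = f(X)\, dt + g(X)\, dW$, Milstein's method adds the next term of the Ito-Taylor expansion to the Euler-Maruyama update: \begin{equation} X_{n+1} = X_n + f(X_n)\, \Delta t + g(X_n)\, \Delta W_n + \tfrac{1}{2} g(X_n) g'(X_n) \left( \Delta W_n^2 - \Delta t \right), \end{equation} which raises the strong order from $\gamma = \tfrac{1}{2}$ to $\gamma = 1$. Here $g(X) = \beta X$, so the correction term is $\tfrac{1}{2} \beta^2 X_n ( \Delta W_n^2 - \Delta t )$, as in the code below.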


In [12]:
r = 2.0
K = 1.0
beta = 0.25
X0 = 0.5
T = 1.0

N = 2**11
dt = T / N

M = 500

dW = np.sqrt(dt) * np.random.randn(M, N)
X_mil = np.zeros((M, 5))

Dtvals = np.zeros(5)

# No closed-form solution is available for this SDE, so we compare against
# a reference solution computed at the finest resolution (R = 1)
R = [1, 16, 32, 64, 128]
for p in range(5):
    Dt = R[p]*dt
    Dtvals[p] = Dt
    L = int(N / R[p])
    Xtemp = X0 * np.ones(M)
    for j in range(L):
        Winc = np.sum(dW[:, R[p]*j:R[p]*(j+1)],1)
        Xtemp += Dt*r*Xtemp*(K-Xtemp) + beta*Xtemp*Winc + 0.5*beta**2*Xtemp*(Winc**2 - Dt)
    X_mil[:, p] = Xtemp

X_ref = X_mil[:, 0]
strong_error = np.zeros(4)
for i in range(1, 5):
    strong_error[i-1] = np.mean(np.abs(X_mil[:, i] - X_ref))

# Measure the convergence rate
p = np.polyfit(np.log(Dtvals[1:]), np.log(strong_error), 1)
print("Measured convergence rate is {}".format(p[0]))

plt.loglog(Dtvals[1:], strong_error, 'b*-')
plt.loglog(Dtvals[1:], np.exp(p[1])*Dtvals[1:], 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'Sample average of $|X(t) - X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = 1$'), loc = 'upper left')

plt.show()


Measured convergence rate is 1.0320542737922012

Linear Stability

Finally, we look at the linear stability of the numerical methods. In particular we focus on the Euler-Maruyama method applied to the simple SDE \begin{equation} dX = \lambda X dt + \mu X dW(t) \end{equation} with initial data $X(0) = X_0 = 1$ and parameters $\lambda = -3, \mu = \sqrt{3}$.

For these parameter values the SDE itself is stable: $E[X(t)^2] \to 0$ as $t \to \infty$ (mean-square stability) and $|X(t)| \to 0$ with probability 1 (asymptotic stability). The question is whether the numerical solution reproduces this decay, which depends on the timestep taken.
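For Euler-Maruyama on this SDE the mean-square behaviour can be computed directly. One step gives $X_{n+1} = X_n (1 + \Delta t \lambda + \mu \Delta W_n)$, and squaring and taking expectations yields \begin{equation} E[X_{n+1}^2] = \left[ (1 + \Delta t \lambda)^2 + \Delta t \mu^2 \right] E[X_n^2], \end{equation} so the numerical solution is mean-square stable if and only if $(1 + \Delta t \lambda)^2 + \Delta t \mu^2 < 1$. As a quick sanity check of the three timesteps used below (only $\Delta t = \tfrac{1}{4}$ should decay):

lam = -3.0
mu = np.sqrt(3.0)
for Dt in [0.25, 0.5, 1.0]:
    # Mean-square amplification factor per Euler-Maruyama step
    amp = (1 + Dt*lam)**2 + Dt*mu**2
    print("Dt = {}: amplification factor {} ({})".format(
        Dt, amp, "stable" if amp < 1 else "unstable"))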


In [14]:
# This is the mean square case

T = 20.0
M = 50000
X0 = 1.0
lam = -3.0
mu = np.sqrt(3.0)

dt = 0.25
N = int(T / dt) # Number of steps at the finest resolution
plt.subplot(2,1,1)
for k in range(3):
    Dt = 2**k * dt
    L = int(N / 2**k)
    Xms = np.zeros(L)
    Xtemp = X0 * np.ones(M)
    t = np.linspace(Dt, T, L) # Times t_j = j*Dt at which the moments are recorded
    for j in range(L):
        Winc = np.sqrt(Dt) * np.random.randn(M)
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
        Xms[j] = np.mean(Xtemp**2)
    plt.semilogy(t, Xms)
plt.legend((r'$\Delta t = \frac{1}{4}$', r'$\Delta t = \frac{1}{2}$', r'$\Delta t = 1$'), loc = 'lower left')
plt.title(r'Mean-Square: $\lambda = -3, \mu = \sqrt{3}$')
plt.ylabel(r'$E[X^2]$')

# This is a single asymptotic path
T = 500.0
N = int(T / dt) # Recompute the step count for the longer time interval
plt.subplot(2,1,2)
for k in range(3):
    Dt = 2**k * dt
    L = int(N / 2**k)
    Xemabs = np.zeros(L)
    Xtemp = X0
    t = np.linspace(Dt, T, L)
    for j in range(L):
        Winc = np.sqrt(Dt) * np.random.randn()
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
        Xemabs[j] = np.abs(Xtemp)
    plt.semilogy(t, Xemabs)
plt.legend((r'$\Delta t = \frac{1}{4}$', r'$\Delta t = \frac{1}{2}$', r'$\Delta t = 1$'), loc = 'lower left')
plt.title(r'Single Path: $\lambda = -3, \mu = \sqrt{3}$')
plt.ylabel(r'$|X|$')
plt.xlabel(r'$t$')

plt.show()